A Keyword Based Prototype for Web Search Result Diversification

Authors

  • Gu-Li Lin
  • Hong Peng
  • Qian-Li Ma
  • Jia Wei
  • Jiang-Wei Qin
Abstract

In web search scenarios, users often submit short queries to search engines, expecting to find the desired information among the top-ranked results. However, such queries are often so ambiguous that the user's actual information need is left unspecified. To satisfy the different possible information needs, an effective approach is to diversify the top results retrieved for the query. In this paper, we reduce the diversification problem to maximizing the coverage of information facets related to the query, and introduce KED, a novel keyword-based prototype for web search result diversification. KED provides a diverse ranking by selecting documents that cover keywords belonging to different facets underlying the retrieved documents. We evaluated the effectiveness of KED on two public test collections containing different kinds of documents. The experimental results show that KED consistently outperforms other existing implicit diversification approaches in promoting the diversity of top-ranked results. Moreover, we show that its effectiveness can be further improved by using high-quality keywords.
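
The maximum-coverage framing in the abstract can be illustrated with the standard greedy heuristic for maximum coverage. The sketch below is not the KED algorithm itself, whose details the abstract does not give. It assumes each retrieved document already comes with a set of keywords, and it skips the grouping of keywords into facets. The function name `greedy_keyword_coverage`, the document identifiers, and the toy keywords are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_keyword_coverage(doc_keywords: Dict[str, Set[str]], k: int) -> List[str]:
    """Pick up to k documents, each time taking the one that covers the most
    keywords not yet covered (the standard greedy maximum-coverage heuristic)."""
    covered: Set[str] = set()
    ranking: List[str] = []
    # Shallow copy; dict insertion order stands in for the original ranking.
    remaining = dict(doc_keywords)
    while remaining and len(ranking) < k:
        # max() returns the first maximal item, so ties keep the original order.
        best = max(remaining, key=lambda d: len(remaining[d] - covered))
        ranking.append(best)
        covered |= remaining.pop(best)
    return ranking


# Hypothetical toy data for an ambiguous query such as "jaguar".
docs = {
    "d1": {"car", "engine", "dealer"},
    "d2": {"car", "engine"},
    "d3": {"cat", "habitat"},
    "d4": {"car", "dealer", "price"},
}
top3 = greedy_keyword_coverage(docs, 3)
print(top3)
```

On this toy input, the near-duplicate `d2` is passed over in favor of `d3`, which covers a different sense of the query. That is the behavior a coverage-based diversifier aims for.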


Similar resources

A Survey on Keyword Diversification Over XML Data

Keyword queries are terms that users enter to retrieve documents containing all or any of those terms. They are the most familiar and popular method used by ordinary users to search data. Keyword queries are highly ambiguous. Keyword search has emerged as one of the most effective ways of discovering information, especially over HTML documents on the World Wide Web. Because ...


Expert Discovery: A web mining approach

Expert discovery is the search for an answer to the question: “Who is the best expert of a specific subject in a particular domain within peculiar array of parameters?” Experts with domain knowledge in any field are crucial for consulting in industry, academia, and the scientific community. The aim of this study is to address the issues of the expert-finding task in a real-world community. Collabor...


Diversified Spatial Keyword Search On Road Networks

With the increasing pervasiveness of the geo-positioning technologies, there is an enormous amount of spatio-textual objects available in many applications such as location based services and social networks. Consequently, various types of spatial keyword searches which explore both locations and textual descriptions of the objects have been intensively studied by the research communities and c...


Query Architecture Expansion in Web Using Fuzzy Multi Domain Ontology

Due to the growth of the web, there are many challenges in establishing a general framework for data mining and retrieving structured data from the Web. Creating an ontology is a step towards solving this problem. The ontology raises the main entity and the concept of any data in data mining. In this paper, we tried to propose a method for applying the "meaning" of the search system, but the problem ...


Efficient Verification of Web-Content Searching Through Authenticated Web Crawlers

We consider the problem of verifying the correctness and completeness of the result of a keyword search. We introduce the concept of an authenticated web crawler and present its design and prototype implementation. An authenticated web crawler is a trusted program that computes a specially crafted signature over the web contents it visits. This signature enables (i) the verification of common In...



Journal title:
  • J. Inf. Sci. Eng.

Volume 28  Issue 

Pages  -

Publication date 2012